When data is inserted into, deleted from, or updated in a SQL Server table, the indexes defined on that table are automatically updated to reflect those changes. As the indexes are modified, the information stored in them becomes fragmented, resulting in the information being scattered across the data files. When this occurs, the logical ordering of the data no longer matches the physical ordering, which can lead to a deterioration of query performance.
To fix this problem, indexes must be periodically reorganized or rebuilt (defragmented) so the physical order of the leaf-level pages matches the logical order of the leaf nodes. This means that you should analyze your indexes periodically to determine whether they’ve become fragmented and the extent of that fragmentation. From there, you can either reorganize or rebuild the affected indexes, depending on the results of your analysis. In this article, I explain how to analyze SQL Server indexes and how reorganize and rebuild them.
Note: Some sources distinguish between internal and external fragmentation. Internal fragmentation indicates that there is too much free space on the index page. External fragmentation indicates that the logical ordering and physical ordering do not match. Microsoft documentation doesn’t make this distinction and simply provides guidelines for when to defragment indexes and which type of defragmentation to perform. For this article, I follow the Microsoft guidelines.
Analyzing Fragmentation
To analyze SQL Server 2005 or 2008 indexes, you use the system function sys.dm_db_index_physical_stats
to determine which indexes are fragmented and the extent of that fragmentation. You can use the function to analyze all the indexes in an instance of SQL Server 2005 or 2008, all indexes in a database, all indexes defined on a table, or a specific index. You can also analyze an index based on the partition number of the indexed object.
The sys.dm_db_index_physical_stats
function takes the following parameters (in the order specified):
- Database ID: A smallint value that represents the ID number of a database. If null is specified, the function retrieves index-related data from all databases on a SQL Server instance. If you specify null, you must also specify null for the object ID, index ID, and partition number.
- Object ID: An int value that represents the ID number of a table or view. If null is specified, the function retrieves index-related data for all tables and views in a specific database or SQL Server instance. If you specify null, you must also specify null for the index ID and partition number.
- Index ID: An int value that represents the ID number of an index. If null is specified, the function retrieves index-related data for all indexes defined on the specified table or view. If you specify null, you must also specify null for the partition number. Also, if the object ID refers to a heap, use 0 as the index ID.
- Partition number: An int value that represents the partition number of an index or heap. If null is specified, the function retrieves index-related information for all partitions related to a specific object.
- Mode: The scan level used to obtain index-related information. Valid inputs include NULL, DEFAULT, or one of the following three modes:
-
- LIMITED: Scans the smallest number of pages, which means this is the fastest mode. The LIMITED mode is equivalent to NULL and DEFAULT.
- SAMPLED: Scans 1% of all pages. If an index contains fewer than 10,000 pages, then DETAILED mode is used.
- DETAILED: Scans all index pages, which means this is the slowest mode, but most accurate.
You must specify all five parameters, even if their values are null.
The sys.dm_db_index_physical_stats
function returns a number of values that provide details about the indexes you specify. The topic “sys.dm_db_index_physical_stats
” in SQL Server Books Online provides details about each of these values. However, several values are worth noting when analyzing an index:
- avg_fragmentation_in_percent: Percentage of the logical index that is fragmented.
- fragment_count: Number of fragments in the leaf level.
- avg_fragment_size_in_pages: Average number of pages in a leaf-level fragment.
- page_count: Number of index or data pages.
An index always has at least one fragment (fragment_count). The maximum number of fragments that an index can have is equal to the number of pages (page_count). For example, an index that is made up of five pages can at the most have five fragments. The larger the fragment, the less disk I/O that is required. So a five-page index with one fragment requires less disk I/O than the index with five fragments. Ideally, the avg_fragmentation_in_percent value should be as close to zero as possible, and the avg_fragment_size_in_pages should be as high as possible.
Based on your index analysis, you can determine what action to take. Microsoft recommends that you reorganize your index if the avg_fragmentation_in_percent value is less than or equal to 30% and rebuild the index if the value is greater than 30%. (Reorganizing and rebuilding indexes are described in the following sections.)
Note: The topic “sys.dm_db_index_physical_stats
” in SQL Server Books Online includes a script that automatically analyzes and then, based on that analysis, reorganizes or rebuilds your indexes according to the Microsoft guidelines. You can modify the script as necessary to meet your administrative needs.
Keep in mind that these recommendations are guidelines only. A fragmented index (especially a low percentage) is not always enough of a reason to reorganize or rebuild your index. If your queries do not regularly involve table scans as a result of singleton lookups, defragmenting the index might have no effect on performance. In addition, for smaller indexes with relatively few pages and small amounts of data, you might see little to no improvement when you defragment the index. FILLFACTOR
settings can also affect the types of improvements you see.
That said, you should still analyze your indexes regularly, and the sys.dm_db_index_physical_stats
function is the best tool to use. So let’s take a look at an example of how to use the function to retrieve index-related statistics. In the following SELECT statement, I retrieve index data from the AdventureWorks database:
1 2 3 4 5 6 7 8 9 10 |
SELECT object_id AS ObjectID, index_id AS IndexID, avg_fragmentation_in_percent AS PercentFragment, fragment_count AS TotalFrags, avg_fragment_size_in_pages AS PagesPerFrag, page_count AS NumPages FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'), NULL, NULL, NULL , 'DETAILED') WHERE avg_fragmentation_in_percent > 0 ORDER BY ObjectID, IndexID |
As you can see, I use the sys.dm_db_index_physical_stats
function to retrieve the four values described above, plus the object_id and index_id values. Notice that I pull back only those indexes with an avg_fragmentation_in_percent value greater than zero and I order the results by object ID and index ID. That way, the indexes for each table and view are grouped together. Notice also that I use the DB_ID() function to retrieve the ID for the AdventureWorks database. Table 1 shows a partial list of the results that are returned from my local instance of SQL Server 2005. (I also ran the SELECT statement against an instance of SQL Server 2008 and received similar results.)
ObjectID |
IndexID |
PercentFragment |
TotalFrags |
PagesPerFrag |
NumPages |
18099105 |
2 |
50 |
2 |
1 |
2 |
30623152 |
1 |
20 |
2 |
2.5 |
5 |
30623152 |
2 |
66.6666666666667 |
3 |
1 |
3 |
30623152 |
3 |
50 |
2 |
1 |
2 |
30623152 |
4 |
50 |
2 |
1 |
2 |
53575229 |
4 |
7.40740740740741 |
3 |
9 |
27 |
66099276 |
1 |
50 |
2 |
1 |
2 |
78623323 |
1 |
0.380710659898477 |
29 |
27.1724137931034 |
788 |
78623323 |
2 |
1.91082802547771 |
8 |
19.625 |
157 |
78623323 |
3 |
1.41509433962264 |
13 |
16.3076923076923 |
212 |
158623608 |
1 |
0.483870967741936 |
31 |
20 |
620 |
158623608 |
2 |
7.31707317073171 |
14 |
8.78571428571429 |
123 |
158623608 |
3 |
1.78571428571429 |
10 |
16.8 |
168 |
181575685 |
1 |
15 |
4 |
5 |
20 |
181575685 |
2 |
33.3333333333333 |
5 |
1.8 |
9 |
181575685 |
3 |
30 |
4 |
2.5 |
10 |
270624007 |
1 |
50 |
2 |
1 |
2 |
279672044 |
1 |
98.5714285714286 |
70 |
1 |
70 |
306100131 |
2 |
17.6470588235294 |
4 |
4.25 |
17 |
309576141 |
1 |
0.536672629695885 |
14 |
39.9285714285714 |
559 |
Table 1: Partial List of Index Statistics for the AdventureWorks Database
Each row in Table 1 represents a specific index whose avg_fragmentation_in_percent
value is greater than zero. As you can see, the statistics can vary greatly, but every index is fragmented to a certain degree.
As I mentioned, Table 1 shows only a partial list of indexes, but the number of indexes can grow quite large. As a result, you might want to narrow down the result set. In the following example, I specify the Sales.StoreContact
table in the AdventureWorks database so that information related only to that table’s indexes is returned:
1 2 3 4 5 6 7 8 9 10 11 12 |
SELECT b.name AS IndexName, a.avg_fragmentation_in_percent AS PercentFragment, a.fragment_count AS TotalFrags, a.avg_fragment_size_in_pages AS PagesPerFrag, a.page_count AS NumPages FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'), OBJECT_ID('Sales.StoreContact'), NULL, NULL , 'DETAILED') AS a JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id WHERE a.avg_fragmentation_in_percent > 0 ORDER BY IndexName |
Notice that I use the OBJECT_ID()
function to return the object ID of the StoreContact
table. Also notice that I join the sys.dm_db_index_physical_stats
function to the sys.indexes
table to retrieve the name of the indexes. Now the results are far easier to read, as you can see in Table 2.
IndexName |
PercentFragment |
TotalFrags |
PagesPerFrag |
NumPages |
AK_StoreContact_rowguid |
66.6666666666667 |
3 |
1 |
3 |
IX_StoreContact_ContactID |
50 |
2 |
1 |
2 |
IX_StoreContact_ContactTypeID |
50 |
2 |
1 |
2 |
PK_StoreContact_CustomerID_ContactID |
20 |
2 |
2.5 |
5 |
Table 2: List of Index Statistics for the Sales.StoreContact Table in the AdventureWorks Database
As Table 2 indicates, the StoreContact
table contains four indexes whose avg_fragmentation_in_percent value is greater than zero. Based on Microsoft’s guidelines, I might choose to reorganize the PK_StoreContact_CustomerID_ContactID
index and rebuild the others.
However, before we proceed to that process, let’s look at one more example. In the following SELECT statement, I return statistics on a specific index in the StoreContact
table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
DECLARE @IndexName VARCHAR(100) SET @IndexName = 'PK_StoreContact_CustomerID_ContactID' DECLARE @IndexID SMALLINT SET @IndexID = (SELECT index_id FROM sys.indexes WHERE name = @IndexName) SELECT @IndexName AS IndexName, avg_fragmentation_in_percent AS PercentFragment, fragment_count AS TotalFrags, avg_fragment_size_in_pages AS PagesPerFrag, page_count AS NumPages FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'), OBJECT_ID('Sales.StoreContact'), @IndexID, NULL , 'DETAILED') WHERE avg_fragmentation_in_percent > 0 |
In this case, I declare variables to pass in the index name and to retrieve the index ID. I then use those variables in the SELECT statement to retrieve the necessary data. Table 3 shows the information returned by this statement:
IndexName |
PercentFragment |
TotalFrags |
PagesPerFrag |
NumPages |
PK_StoreContact_CustomerID_ContactID |
20 |
2 |
2.5 |
5 |
Table 3: Statistics on the PK_StoreContact_CustomerID_ContactID
Index
As you would expect, the data matches the data shown in Table 2 for this index. From this information, you can see that 20% of the logical index is fragmented (avg_fragmentation_in_percent = 20), the index contains two fragments (fragment_count = 2) that are spread across five pages (page_count = 5), for an average of 2.5 pages per fragment (avg_fragment_size_in_pages = 2.5). Based on the avg_fragmentation_in_percent value of 20%, the index should be reorganized, rather the rebuilt.
Reorganizing Indexes
When you reorganize an index, SQL Server physically reorders the leaf-level pages to match the logical order of the leaf nodes. The process uses the existing pages only and does not allocate new ones, but it does compact the index pages. In addition, reorganization uses minimal resources and is automatically performed online, without blocking queries or updates. You should reorganize indexes only if they’re lightly fragmented, otherwise, you should rebuild them.
To reorganize an index, run an ALTER INDEX
statement and include the keyword REORGANIZE
, as shown in the following example:
1 2 3 |
ALTER INDEX PK_StoreContact_CustomerID_ContactID ON AdventureWorks.Sales.StoreContact REORGANIZE |
Notice that I specify the index name and the table. After I run this statement, I then use the sys.dm_db_index_physical_stats
function to retrieve index-related statistics. The results are shown in Table 4.
IndexName |
PercentFragment |
TotalFrags |
PagesPerFrag |
NumPages |
PK_StoreContact_CustomerID_ContactID |
20 |
2 |
2.5 |
5 |
Table 4: Statistics on the PK_StoreContact_CustomerID_ContactID
Index after Reorganization
If you compare these results to those shown Table 3, you’ll see that nothing has changed. When you try to reorganize or rebuild an index with few pages, you’ll often see no benefit. However, when I reorganized the PK_Contact_ContactID
index on the Person.Contact
table, I reduced the number of fragments on 559 pages from 14 to 9 and lowered the percentage of fragmentation by about one-third.
Rebuilding Indexes
Rebuilding an index is generally more effective than reorganizing it because it drops the original index and builds a new one. As a result, the index starts clean with minimal fragmentation and the pages are compacted, and new pages are allocated as needed. In addition, you can choose to rebuild an index offline (the default) or online.
Rebuilding an index can be as simple as reorganizing one. Once again, use the ALTER INDEX
statement, but specify the REBUILD
keyword, rather than REORGANIZE
, as shown in the following example:
1 2 3 |
ALTER INDEX IX_StoreContact_ContactID ON AdventureWorks.Sales.StoreContact REBUILD |
In this case, I’m rebuilding the IX_StoreContact_ContactID
index. When I pull statistics on the rebuilt index, I get the results shown Table 5.
IndexName |
PercentFragment |
TotalFrags |
PagesPerFrag |
NumPages |
|
50 |
2 |
1 |
2 |
Table 5: Statistics on the IX_StoreContact_ContactID
Index after Rebuilding
If you compare the results shown in Table 5 to Table 2, you’ll see that once again, nothing has changed. As with reorganizing an index, you’ll often find that rebuilding an index with few pages will result in little to no benefit. When I rebuilt the PK_Individual_CustomerID
index on the Sales.Individual table, I reduced the number of pages from 3082 to 3081 and the number of fragments from 71 to 18. I also reduced the percentage of fragmentation by about 80%. Note, however, that the reason I didn’t use this index originally is because the percentage of fragmentation was already very low. As I mentioned above, the benefit you’ll see from rebuilding (or reorganizing) an index depends on factors such as page count, index size, and Fill Factor.
When you rebuild an index, you can include a WITH
clause after the REBUILD
keyword. In the WITH
clause, you can specify one or more options that define how the index will be rebuilt. For example, you can define a Fill Factor, specify whether to recompute the distribution statistics, or specify whether row locks are allowed. In the following ALTER INDEX
statement, I include a WITH
clause that defines the FILLFACTOR
and ONLINE
options:
1 2 3 4 5 6 |
ALTER INDEX IX_StoreContact_ContactTypeID ON AdventureWorks.Sales.StoreContact REBUILD WITH ( FILLFACTOR = 70, ONLINE = ON ) |
In this case, the FILLFACTOR
option is set to 70, which determines the percentage of how much the leaf level of each index page will be filled during index creation or alteration. As a result, the leaf level will be filled to only 70%. The ONLINE
option is set to ON, which means that the StoreContact
table and IX_StoreContact_ContactTypeID
index will be available for queries and data modification during the index operation. For a complete list of all the options available to the WITH
clause, see the topic “ALTER INDEX
(Transact-SQL)"
in SQL Server Books Online.
After I ran the statement above and rebuilt the IX_StoreContact_ContactTypeID
index, I retrieved the statistics on the index, as shown in Table 6.
IndexName |
PercentFragment |
TotalFrags |
PagesPerFrag |
NumPages |
|
33.3333333333333 |
2 |
1.5 |
3 |
Table 6: Statistics on the IX_StoreContact_ContactTypeID
Index after Rebuilding
If you compare these results to those in Table 2, you’ll see that the percentage of fragmentation dropped from 50% to 33.3% and the number of pages increased from 2 to 3 as a result of changing the FILLFACTOR
. (The default Fill Factor is 100%.) There are still two fragments, but now the average is 1.5 pages per fragment, rather than 1.
Another method you can use to rebuild an index is the CREATE INDEX
statement. To use this statement, you must create an index by the same name as the original and include the DROP_EXISTING
option in the WITH
clause, as shown in the following example:
1 2 3 4 5 6 7 |
CREATE UNIQUE NONCLUSTERED INDEX AK_StoreContact_rowguid ON AdventureWorks.Sales.StoreContact (rowguid ASC) WITH ( DROP_EXISTING = ON, FILLFACTOR = 70, ONLINE = ON ) ON [PRIMARY] |
In this case, I’m rebuilding the AK_StoreContact_rowguid
index, which is a unique, nonclustered index. Notice that the DROP_EXISTING option is set to ON, which means that the original index will be dropped and a new one created. Notice also that I’ve included the FILLFACTOR
and ONLINE
options in the WITH
clause. For a complete list of all the options available to the WITH
clause, see the topic "CREATE INDEX (Transact-SQL)"
in SQL Server Books Online.
Now take a look at Table 7, which shows the index statistics that I retrieved after I rebuilt the index.
IndexName |
PercentFragment |
TotalFrags |
PagesPerFrag |
NumPages |
|
25 |
2 |
2 |
4 |
Table 7: Statistics on the AK_StoreContact_rowguid
Index after Re-Creating Index
Once again, if you compare these results to Table 2, you’ll see that I reduced the percentage of fragmentation from 66.7% to 25%, reduced the number of fragments from 3 to 2, and increased the pages per fragment from 1 to 2 (in part because I reduced the Fill Factor to 70%).
Using the CREATE INDEX
statement rather than ALTER INDEX
statement to rebuild an index has several advantages. The CREATE INDEX
statement lets you add or remove key columns, change column order, change the sort order, repartition a partitioned index, and move the index to a different filegroup. However, the advantage to using the ALTER INDEX
statement is that you can rebuild more than one index in a single transaction.
Note: You can also drop an index and then use the CREATE INDEX
statement to re-create that index. However, Microsoft recommends against this approach because the index is offline during the drop and re-creation operations. Also, if the transaction fails, the index is not re-created.
Regardless of the approach you use to rebuild an index-or whether you rebuild or reorganize an index-the important point to remember is that you must first properly analyze your indexes, and that’s where the sys.dm_db_index_physical_stats
function comes in. The function provides valuable details about all your indexes, and then, from this information, you can determine how to proceed. The actual reorganizing and rebuilding are very straightforward. You can even automate these processes based on the statistics returned by the function. Be sure to refer to SQL Server Books Online as necessary for details about all these operations.
Load comments